Overview

Dataset statistics

Number of variables: 13
Number of observations: 2106
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 214.0 KiB
Average record size in memory: 104.1 B

Variable types

Categorical: 3
Numeric: 10

Alerts

datum has a high cardinality: 2106 distinct values (High cardinality)
R06 is highly correlated with Month (High correlation)
Month is highly correlated with R06 (High correlation)
datum is uniformly distributed (Uniform)
Weekday Name is uniformly distributed (Uniform)
datum has unique values (Unique)
M01AB has 40 (1.9%) zeros (Zeros)
M01AE has 36 (1.7%) zeros (Zeros)
N02BA has 78 (3.7%) zeros (Zeros)
N02BE has 26 (1.2%) zeros (Zeros)
N05B has 43 (2.0%) zeros (Zeros)
N05C has 1430 (67.9%) zeros (Zeros)
R03 has 484 (23.0%) zeros (Zeros)
R06 has 256 (12.2%) zeros (Zeros)
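
The percentages in the Zeros alerts are the zero counts divided by the 2106 observations. A minimal sketch that re-derives them from the counts listed above; the names `zero_counts` and `zero_pct` are only illustrative and are not part of the report.

```python
# Zero counts per variable, copied from the Zeros alerts above.
zero_counts = {
    "M01AB": 40, "M01AE": 36, "N02BA": 78, "N02BE": 26,
    "N05B": 43, "N05C": 1430, "R03": 484, "R06": 256,
}
n_observations = 2106  # "Number of observations" in the dataset statistics

# Share of zeros per variable, rounded to one decimal place like the report.
zero_pct = {
    name: round(100 * count / n_observations, 1)
    for name, count in zero_counts.items()
}

for name, pct in zero_pct.items():
    print(f"{name}: {pct}%")
```

N05C stands out: roughly two thirds of its rows are zero, which is why its median and Q1 are both 0.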

Reproduction

Analysis started: 2022-02-23 07:02:11.664788
Analysis finished: 2022-02-23 07:02:37.356078
Duration: 25.69 seconds
Software version: pandas-profiling v3.1.1
Download configuration: config.json

Variables

datum
Categorical

HIGH CARDINALITY
UNIFORM
UNIQUE

Distinct: 2106
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 16.6 KiB

Value                 Count
1/2/2014              1
11/16/2017            1
11/14/2017            1
11/13/2017            1
11/12/2017            1
Other values (2101)   2101

Length

Max length: 10
Median length: 9
Mean length: 8.924026591
Min length: 8

Characters and Unicode

Total characters: 18794
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2106
Unique (%): 100.0%

Sample

1st row: 1/2/2014
2nd row: 1/3/2014
3rd row: 1/4/2014
4th row: 1/5/2014
5th row: 1/6/2014
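
datum is flagged as high-cardinality and unique because every row holds a distinct date stored as text. A minimal sketch of parsing these strings into dates, assuming they follow a month/day/year layout (values such as 1/13/2014 and 11/16/2017 suggest this). The helper names `parse_datum` and `weekday_name` are hypothetical.

```python
from datetime import datetime

# English weekday names indexed by datetime.weekday() (Monday is 0).
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def parse_datum(text):
    """Parse a month/day/year string such as '1/2/2014' (assumed layout)."""
    return datetime.strptime(text, "%m/%d/%Y")


def weekday_name(text):
    """Return the English weekday name for a datum string."""
    return WEEKDAYS[parse_datum(text).weekday()]


# The five sample rows shown above.
sample = ["1/2/2014", "1/3/2014", "1/4/2014", "1/5/2014", "1/6/2014"]
for value in sample:
    parsed = parse_datum(value)
    print(value, parsed.year, parsed.month, weekday_name(value))
```

Parsed this way, the column would be profiled as a date rather than as a categorical variable, and the derived year, month, and weekday can be compared with the Year, Month, and Weekday Name columns.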

Common Values

Value                 Count   Frequency (%)
1/2/2014              1       < 0.1%
11/16/2017            1       < 0.1%
11/14/2017            1       < 0.1%
11/13/2017            1       < 0.1%
11/12/2017            1       < 0.1%
11/11/2017            1       < 0.1%
11/10/2017            1       < 0.1%
11/9/2017             1       < 0.1%
11/8/2017             1       < 0.1%
11/7/2017             1       < 0.1%
Other values (2096)   2096    99.5%

Length

Histogram of lengths of the category
Value                 Count   Frequency (%)
1/2/2014              1       < 0.1%
1/4/2014              1       < 0.1%
1/6/2014              1       < 0.1%
1/7/2014              1       < 0.1%
1/8/2014              1       < 0.1%
1/9/2014              1       < 0.1%
1/10/2014             1       < 0.1%
1/11/2014             1       < 0.1%
1/12/2014             1       < 0.1%
1/13/2014             1       < 0.1%
Other values (2096)   2096    99.5%

Most occurring characters

Value   Count   Frequency (%)
/       4212    22.4%
1       3846    20.5%
2       3323    17.7%
0       2470    13.1%
7       759     4.0%
8       759     4.0%
5       759     4.0%
6       754     4.0%
4       752     4.0%
9       663     3.5%

Most occurring categories

Value               Count   Frequency (%)
Decimal Number      14582   77.6%
Other Punctuation   4212    22.4%

Most frequent character per category

Decimal Number
Value   Count   Frequency (%)
1       3846    26.4%
2       3323    22.8%
0       2470    16.9%
7       759     5.2%
8       759     5.2%
5       759     5.2%
6       754     5.2%
4       752     5.2%
9       663     4.5%
3       497     3.4%

Other Punctuation
Value   Count   Frequency (%)
/       4212    100.0%

Most occurring scripts

Value    Count   Frequency (%)
Common   18794   100.0%

Most frequent character per script

Common
Value   Count   Frequency (%)
/       4212    22.4%
1       3846    20.5%
2       3323    17.7%
0       2470    13.1%
7       759     4.0%
8       759     4.0%
5       759     4.0%
6       754     4.0%
4       752     4.0%
9       663     3.5%

Most occurring blocks

Value   Count   Frequency (%)
ASCII   18794   100.0%

Most frequent character per block

ASCII
Value   Count   Frequency (%)
/       4212    22.4%
1       3846    20.5%
2       3323    17.7%
0       2470    13.1%
7       759     4.0%
8       759     4.0%
5       759     4.0%
6       754     4.0%
4       752     4.0%
9       663     3.5%

M01AB
Real number (ℝ≥0)

ZEROS

Distinct: 218
Distinct (%): 10.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.033683325
Minimum: 0
Maximum: 17.34
Zeros: 40
Zeros (%): 1.9%
Negative: 0
Negative (%): 0.0%
Memory size: 16.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 3
median: 4.99
Q3: 6.67
95-th percentile: 10
Maximum: 17.34
Range: 17.34
Interquartile range (IQR): 3.67

Descriptive statistics

Standard deviation: 2.737578507
Coefficient of variation (CV): 0.543851953
Kurtosis: 0.5900347612
Mean: 5.033683325
Median Absolute Deviation (MAD): 1.99
Skewness: 0.6423125147
Sum: 10600.93708
Variance: 7.494336084
Monotonicity: Not monotonic
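
Several of these figures follow from the others by simple arithmetic. A minimal sketch, using only the M01AB numbers reported above, of how the coefficient of variation, interquartile range, range, variance, and mean relate; the variable names are illustrative.

```python
# Values copied from the M01AB statistics above.
mean = 5.033683325
std = 2.737578507
q1, q3 = 3.0, 6.67
minimum, maximum = 0.0, 17.34
total = 10600.93708
n_observations = 2106

cv = std / mean                        # coefficient of variation
iqr = q3 - q1                          # interquartile range
value_range = maximum - minimum        # range
variance = std ** 2                    # variance is the squared standard deviation
mean_from_sum = total / n_observations # mean recomputed from the sum

print(f"CV={cv:.6f} IQR={iqr:.2f} range={value_range:.2f}")
print(f"variance={variance:.6f} mean={mean_from_sum:.6f}")
```

The same relations hold for the other numeric variables in this report.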
Histogram with fixed size bins (bins=50)
Value                Count   Frequency (%)
4                    126     6.0%
5                    119     5.7%
3                    111     5.3%
6                    99      4.7%
2                    99      4.7%
1                    63      3.0%
7                    60      2.8%
5.33                 58      2.8%
3.33                 57      2.7%
2.33                 55      2.6%
Other values (208)   1259    59.8%

Value     Count   Frequency (%)
0         40      1.9%
0.2125    1       < 0.1%
0.33      6       0.3%
0.34      10      0.5%
0.66      2       0.1%
0.67      5       0.2%
0.68      2       0.1%
0.83      1       < 0.1%
1         63      3.0%
1.18      2       0.1%

Value     Count   Frequency (%)
17.34     1       < 0.1%
17        1       < 0.1%
16.68     1       < 0.1%
16.18     1       < 0.1%
15.33     1       < 0.1%
14.66     1       < 0.1%
14.33     1       < 0.1%
14.18     1       < 0.1%
14.01     1       < 0.1%
14        1       < 0.1%

M01AE
Real number (ℝ≥0)

ZEROS

Distinct: 694
Distinct (%): 33.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.895830316
Minimum: 0
Maximum: 14.463
Zeros: 36
Zeros (%): 1.7%
Negative: 0
Negative (%): 0.0%
Memory size: 16.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.81625
Q1: 2.34
median: 3.67
Q3: 5.138
95-th percentile: 7.66
Maximum: 14.463
Range: 14.463
Interquartile range (IQR): 2.798

Descriptive statistics

Standard deviation: 2.133336599
Coefficient of variation (CV): 0.5475948453
Kurtosis: 0.8918244314
Mean: 3.895830316
Median Absolute Deviation (MAD): 1.34
Skewness: 0.7183773911
Sum: 8204.618646
Variance: 4.551125046
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count   Frequency (%)
3                    55      2.6%
2                    53      2.5%
3.34                 52      2.5%
4                    47      2.2%
2.34                 45      2.1%
1                    39      1.9%
2.33                 38      1.8%
0                    36      1.7%
3.33                 34      1.6%
1.34                 34      1.6%
Other values (684)   1673    79.4%

Value     Count   Frequency (%)
0         36      1.7%
0.033     1       < 0.1%
0.066     3       0.1%
0.157     1       < 0.1%
0.198     1       < 0.1%
0.231     1       < 0.1%
0.33      10      0.5%
0.34      12      0.6%
0.363     2       0.1%
0.373     1       < 0.1%

Value     Count   Frequency (%)
14.463    1       < 0.1%
13.34     1       < 0.1%
12.706    1       < 0.1%
11.99     1       < 0.1%
11.745    1       < 0.1%
11.69     1       < 0.1%
11.505    1       < 0.1%
11.32     1       < 0.1%
11.31     1       < 0.1%
10.76     1       < 0.1%

N02BA
Real number (ℝ≥0)

ZEROS

Distinct: 199
Distinct (%): 9.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.880441121
Minimum: 0
Maximum: 16
Zeros: 78
Zeros (%): 3.7%
Negative: 0
Negative (%): 0.0%
Memory size: 16.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.6
Q1: 2
median: 3.5
Q3: 5.2
95-th percentile: 8
Maximum: 16
Range: 16
Interquartile range (IQR): 3.2

Descriptive statistics

Standard deviation: 2.384010237
Coefficient of variation (CV): 0.6143657802
Kurtosis: 1.187551278
Mean: 3.880441121
Median Absolute Deviation (MAD): 1.5
Skewness: 0.8494966517
Sum: 8172.209
Variance: 5.683504808
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count   Frequency (%)
3                    217     10.3%
2                    212     10.1%
4                    178     8.5%
1                    147     7.0%
5                    143     6.8%
6                    104     4.9%
0                    78      3.7%
7                    63      3.0%
8                    31      1.5%
3.5                  30      1.4%
Other values (189)   903     42.9%

Value         Count   Frequency (%)
0             78      3.7%
0.1           3       0.1%
0.15          2       0.1%
0.2           8       0.4%
0.25          6       0.3%
0.3           2       0.1%
0.4           1       < 0.1%
0.416666667   1       < 0.1%
0.45          1       < 0.1%
0.5           3       0.1%

Value         Count   Frequency (%)
16            1       < 0.1%
15            1       < 0.1%
14.4          1       < 0.1%
14            1       < 0.1%
13.7          1       < 0.1%
13.3          1       < 0.1%
13.29166667   1       < 0.1%
12.7          1       < 0.1%
12.5          1       < 0.1%
12.3          1       < 0.1%

N02BE
Real number (ℝ≥0)

ZEROS

Distinct: 713
Distinct (%): 33.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 29.9170953
Minimum: 0
Maximum: 161
Zeros: 26
Zeros (%): 1.2%
Negative: 0
Negative (%): 0.0%
Memory size: 16.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 10.2
Q1: 19
median: 26.9
Q3: 38.3
95-th percentile: 59.9375
Maximum: 161
Range: 161
Interquartile range (IQR): 19.3

Descriptive statistics

Standard deviation: 15.59096554
Coefficient of variation (CV): 0.5211390137
Kurtosis: 3.418982129
Mean: 29.9170953
Median Absolute Deviation (MAD): 9.3
Skewness: 1.202291302
Sum: 63005.40271
Variance: 243.0782065
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count   Frequency (%)
15                   37      1.8%
19                   36      1.7%
20                   34      1.6%
18                   34      1.6%
24                   34      1.6%
22                   34      1.6%
14                   32      1.5%
23                   30      1.4%
16                   27      1.3%
21                   26      1.2%
Other values (703)   1782    84.6%

Value     Count   Frequency (%)
0         26      1.2%
1.2       1       < 0.1%
2         2       0.1%
3         1       < 0.1%
6         3       0.1%
6.2       1       < 0.1%
6.5       1       < 0.1%
6.6       2       0.1%
6.7       1       < 0.1%
6.8       1       < 0.1%

Value     Count   Frequency (%)
161       1       < 0.1%
108.7     1       < 0.1%
100.1     1       < 0.1%
97.8      1       < 0.1%
93.05     1       < 0.1%
89.3      1       < 0.1%
88.4      1       < 0.1%
88.3      1       < 0.1%
88.2      1       < 0.1%
87.2      1       < 0.1%

N05B
Real number (ℝ≥0)

ZEROS

Distinct: 77
Distinct (%): 3.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.853626543
Minimum: 0
Maximum: 54.83333333
Zeros: 43
Zeros (%): 2.0%
Negative: 0
Negative (%): 0.0%
Memory size: 16.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 5
median: 8
Q3: 12
95-th percentile: 19
Maximum: 54.83333333
Range: 54.83333333
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 5.605604747
Coefficient of variation (CV): 0.633142218
Kurtosis: 4.022887927
Mean: 8.853626543
Median Absolute Deviation (MAD): 3
Skewness: 1.324840405
Sum: 18645.7375
Variance: 31.42280458
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count   Frequency (%)
6                   167     7.9%
7                   166     7.9%
5                   160     7.6%
4                   159     7.5%
9                   154     7.3%
8                   151     7.2%
3                   133     6.3%
11                  130     6.2%
10                  118     5.6%
12                  102     4.8%
Other values (67)   666     31.6%

Value     Count   Frequency (%)
0         43      2.0%
1         49      2.3%
2         89      4.2%
2.6       1       < 0.1%
3         133     6.3%
3.5       3       0.1%
4         159     7.5%
5         160     7.6%
5.5       1       < 0.1%
6         167     7.9%

Value         Count   Frequency (%)
54.83333333   1       < 0.1%
43            1       < 0.1%
36            1       < 0.1%
33            3       0.1%
32            2       0.1%
31            4       0.2%
30            1       < 0.1%
29            2       0.1%
28.33333333   1       < 0.1%
28            2       0.1%

N05C
Real number (ℝ≥0)

ZEROS

Distinct: 20
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.5935224755
Minimum: 0
Maximum: 9
Zeros: 1430
Zeros (%): 67.9%
Negative: 0
Negative (%): 0.0%
Memory size: 16.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 1
95-th percentile: 3
Maximum: 9
Range: 9
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.09298832
Coefficient of variation (CV): 1.841528106
Kurtosis: 8.763071582
Mean: 0.5935224755
Median Absolute Deviation (MAD): 0
Skewness: 2.520467859
Sum: 1249.958333
Variance: 1.194623468
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=20)
Value               Count   Frequency (%)
0                   1430    67.9%
1                   334     15.9%
2                   157     7.5%
3                   113     5.4%
4                   23      1.1%
5                   12      0.6%
0.833333333         5       0.2%
1.25                5       0.2%
6                   4       0.2%
0.416666667         4       0.2%
Other values (10)   19      0.9%

Value         Count   Frequency (%)
0             1430    67.9%
0.416666667   4       0.2%
0.625         2       0.1%
0.833333333   5       0.2%
1             334     15.9%
1.25          5       0.2%
1.666666667   2       0.1%
1.875         1       < 0.1%
2             157     7.5%
2.083333333   4       0.2%

Value         Count   Frequency (%)
9             2       0.1%
8             2       0.1%
7             2       0.1%
6             4       0.2%
5             12      0.6%
4.166666667   1       < 0.1%
4             23      1.1%
3             113     5.4%
2.916666667   1       < 0.1%
2.5           2       0.1%

R03
Real number (ℝ≥0)

ZEROS

Distinct: 64
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.512261594
Minimum: 0
Maximum: 45
Zeros: 484
Zeros (%): 23.0%
Negative: 0
Negative (%): 0.0%
Memory size: 16.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
median: 4
Q3: 8
95-th percentile: 20
Maximum: 45
Range: 45
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 6.428736372
Coefficient of variation (CV): 1.166261119
Kurtosis: 4.015505418
Mean: 5.512261594
Median Absolute Deviation (MAD): 3
Skewness: 1.829746242
Sum: 11608.82292
Variance: 41.32865135
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count   Frequency (%)
0                   484     23.0%
1                   257     12.2%
5                   188     8.9%
2                   173     8.2%
6                   146     6.9%
3                   119     5.7%
7                   93      4.4%
10                  84      4.0%
4                   73      3.5%
8                   63      3.0%
Other values (54)   426     20.2%

Value         Count   Frequency (%)
0             484     23.0%
0.416666667   1       < 0.1%
1             257     12.2%
1.25          1       < 0.1%
1.416666667   1       < 0.1%
1.666666667   3       0.1%
1.875         1       < 0.1%
2             173     8.2%
2.5           3       0.1%
2.916666667   1       < 0.1%

Value     Count   Frequency (%)
45        1       < 0.1%
41        1       < 0.1%
37        1       < 0.1%
36        1       < 0.1%
35        1       < 0.1%
34        2       0.1%
33        2       0.1%
31        6       0.3%
30        3       0.1%
29        4       0.2%

R06
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 98
Distinct (%): 4.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.900198243
Minimum: 0
Maximum: 15
Zeros: 256
Zeros (%): 12.2%
Negative: 0
Negative (%): 0.0%
Memory size: 16.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
median: 2
Q3: 4
95-th percentile: 8
Maximum: 15
Range: 15
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 2.415815905
Coefficient of variation (CV): 0.8329830247
Kurtosis: 1.989410302
Mean: 2.900198243
Median Absolute Deviation (MAD): 1
Skewness: 1.292813236
Sum: 6107.8175
Variance: 5.836166485
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count   Frequency (%)
2                   392     18.6%
1                   351     16.7%
3                   295     14.0%
0                   256     12.2%
4                   173     8.2%
5                   143     6.8%
6                   84      4.0%
7                   49      2.3%
8                   26      1.2%
10                  24      1.1%
Other values (88)   313     14.9%

Value     Count   Frequency (%)
0         256     12.2%
0.1       3       0.1%
0.2       4       0.2%
0.3       1       < 0.1%
0.33      1       < 0.1%
0.34      1       < 0.1%
0.4       4       0.2%
0.5       2       0.1%
0.6       1       < 0.1%
0.625     1       < 0.1%

Value     Count   Frequency (%)
15        2       0.1%
13.5      1       < 0.1%
12.4      1       < 0.1%
12.2      1       < 0.1%
12.1      1       < 0.1%
12        6       0.3%
11        9       0.4%
10.5      4       0.2%
10.4      1       < 0.1%
10        24      1.1%

Year
Real number (ℝ≥0)

Distinct: 6
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2016.401235
Minimum: 2014
Maximum: 2019
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 16.6 KiB

Quantile statistics

Minimum: 2014
5-th percentile: 2014
Q1: 2015
median: 2016
Q3: 2018
95-th percentile: 2019
Maximum: 2019
Range: 5
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 1.665060368
Coefficient of variation (CV): 0.0008257584549
Kurtosis: -1.221280398
Mean: 2016.401235
Median Absolute Deviation (MAD): 1
Skewness: 0.04472617313
Sum: 4246541
Variance: 2.772426029
Monotonicity: Increasing
Histogram with fixed size bins (bins=6)
Value   Count   Frequency (%)
2016    366     17.4%
2015    365     17.3%
2017    365     17.3%
2018    365     17.3%
2014    364     17.3%
2019    281     13.3%

Value   Count   Frequency (%)
2014    364     17.3%
2015    365     17.3%
2016    366     17.4%
2017    365     17.3%
2018    365     17.3%
2019    281     13.3%

Value   Count   Frequency (%)
2019    281     13.3%
2018    365     17.3%
2017    365     17.3%
2016    366     17.4%
2015    365     17.3%
2014    364     17.3%

Month
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 12
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.344254511
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 16.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
median: 6
Q3: 9
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.386953836
Coefficient of variation (CV): 0.5338615955
Kurtosis: -1.161574205
Mean: 6.344254511
Median Absolute Deviation (MAD): 3
Skewness: 0.04384499493
Sum: 13361
Variance: 11.47145628
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
Value              Count   Frequency (%)
3                  186     8.8%
5                  186     8.8%
7                  186     8.8%
8                  186     8.8%
1                  185     8.8%
4                  180     8.5%
6                  180     8.5%
9                  180     8.5%
2                  169     8.0%
10                 163     7.7%
Other values (2)   305     14.5%

Value   Count   Frequency (%)
1       185     8.8%
2       169     8.0%
3       186     8.8%
4       180     8.5%
5       186     8.8%
6       180     8.5%
7       186     8.8%
8       186     8.8%
9       180     8.5%
10      163     7.7%

Value   Count   Frequency (%)
12      155     7.4%
11      150     7.1%
10      163     7.7%
9       180     8.5%
8       186     8.8%
7       186     8.8%
6       180     8.5%
5       186     8.8%
4       180     8.5%
3       186     8.8%

Hour
Categorical

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 16.6 KiB

Value   Count
276     2104
248     1
190     1

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 6318
Distinct characters: 8
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): 0.1%

Sample

1st row: 248
2nd row: 276
3rd row: 276
4th row: 276
5th row: 276

Common Values

Value   Count   Frequency (%)
276     2104    99.9%
248     1       < 0.1%
190     1       < 0.1%
 
< 0.1%

Length

Histogram of lengths of the category

Pie chart

Value   Count   Frequency (%)
276     2104    99.9%
248     1       < 0.1%
190     1       < 0.1%

Most occurring characters

Value   Count   Frequency (%)
2       2105    33.3%
7       2104    33.3%
6       2104    33.3%
4       1       < 0.1%
8       1       < 0.1%
1       1       < 0.1%
9       1       < 0.1%
0       1       < 0.1%

Most occurring categories

Value            Count   Frequency (%)
Decimal Number   6318    100.0%

Most frequent character per category

Decimal Number
Value   Count   Frequency (%)
2       2105    33.3%
7       2104    33.3%
6       2104    33.3%
4       1       < 0.1%
8       1       < 0.1%
1       1       < 0.1%
9       1       < 0.1%
0       1       < 0.1%

Most occurring scripts

Value    Count   Frequency (%)
Common   6318    100.0%

Most frequent character per script

Common
Value   Count   Frequency (%)
2       2105    33.3%
7       2104    33.3%
6       2104    33.3%
4       1       < 0.1%
8       1       < 0.1%
1       1       < 0.1%
9       1       < 0.1%
0       1       < 0.1%

Most occurring blocks

Value   Count   Frequency (%)
ASCII   6318    100.0%

Most frequent character per block

ASCII
Value   Count   Frequency (%)
2       2105    33.3%
7       2104    33.3%
6       2104    33.3%
4       1       < 0.1%
8       1       < 0.1%
1       1       < 0.1%
9       1       < 0.1%
0       1       < 0.1%

Weekday Name
Categorical

UNIFORM

Distinct: 7
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 16.6 KiB

Value              Count
Thursday           301
Friday             301
Saturday           301
Sunday             301
Monday             301
Other values (2)   601

Length

Max length: 9
Median length: 7
Mean length: 7.141975309
Min length: 6

Characters and Unicode

Total characters: 15041
Distinct characters: 17
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Thursday
2nd row: Friday
3rd row: Saturday
4th row: Sunday
5th row: Monday

Common Values

Value       Count   Frequency (%)
Thursday    301     14.3%
Friday      301     14.3%
Saturday    301     14.3%
Sunday      301     14.3%
Monday      301     14.3%
Tuesday     301     14.3%
Wednesday   300     14.2%

Length

Histogram of lengths of the category

Pie chart

Value       Count   Frequency (%)
thursday    301     14.3%
friday      301     14.3%
saturday    301     14.3%
sunday      301     14.3%
monday      301     14.3%
tuesday     301     14.3%
wednesday   300     14.2%

Most occurring characters

Value              Count   Frequency (%)
a                  2407    16.0%
d                  2406    16.0%
y                  2106    14.0%
u                  1204    8.0%
r                  903     6.0%
n                  902     6.0%
s                  902     6.0%
e                  901     6.0%
T                  602     4.0%
S                  602     4.0%
Other values (7)   2106    14.0%

Most occurring categories

Value              Count   Frequency (%)
Lowercase Letter   12935   86.0%
Uppercase Letter   2106    14.0%

Most frequent character per category

Lowercase Letter
Value              Count   Frequency (%)
a                  2407    18.6%
d                  2406    18.6%
y                  2106    16.3%
u                  1204    9.3%
r                  903     7.0%
n                  902     7.0%
s                  902     7.0%
e                  901     7.0%
o                  301     2.3%
t                  301     2.3%
Other values (2)   602     4.7%

Uppercase Letter
Value   Count   Frequency (%)
T       602     28.6%
S       602     28.6%
M       301     14.3%
F       301     14.3%
W       300     14.2%

Most occurring scripts

Value   Count   Frequency (%)
Latin   15041   100.0%

Most frequent character per script

Latin
Value              Count   Frequency (%)
a                  2407    16.0%
d                  2406    16.0%
y                  2106    14.0%
u                  1204    8.0%
r                  903     6.0%
n                  902     6.0%
s                  902     6.0%
e                  901     6.0%
T                  602     4.0%
S                  602     4.0%
Other values (7)   2106    14.0%

Most occurring blocks

Value   Count   Frequency (%)
ASCII   15041   100.0%

Most frequent character per block

ASCII
Value              Count   Frequency (%)
a                  2407    16.0%
d                  2406    16.0%
y                  2106    14.0%
u                  1204    8.0%
r                  903     6.0%
n                  902     6.0%
s                  902     6.0%
e                  901     6.0%
T                  602     4.0%
S                  602     4.0%
Other values (7)   2106    14.0%

Interactions

Correlations

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
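
The calculation described above can be sketched in plain Python: rank both variables, then divide the covariance of the ranks by the product of their standard deviations. This is an illustrative sketch, not the code the report generator runs. The function names are hypothetical, and tied values receive the average of their ranks (an assumption about tie handling).

```python
import math


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j share this rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance divided by the product of standard deviations (1/n cancels)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


# M01AB and N05B from the first ten rows of the sample table.
m01ab = [0.0, 8.0, 2.0, 4.0, 5.0, 0.0, 5.33, 7.0, 5.0, 5.0]
n05b = [7.0, 16.0, 10.0, 8.0, 16.0, 0.0, 19.0, 16.0, 15.0, 14.0]
print(spearman_rho(m01ab, n05b))
```

Because only ranks enter the calculation, any strictly increasing relationship (for example y = x³) yields ρ = 1 even though it is not linear.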

Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
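
A minimal sketch of this calculation, with a check of the invariance under changes in location and (positive) scale. The function name `pearson_r` and the sample values are illustrative only.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


# Illustrative values, not taken from the dataset.
x = [1.0, 2.0, 4.0, 7.0, 11.0]
y = [2.0, 3.0, 3.5, 9.0, 10.0]

r_original = pearson_r(x, y)
# Shift and rescale each variable separately (positive scale factors).
r_transformed = pearson_r([2 * a + 3 for a in x], [0.5 * b - 1 for b in y])

print(r_original, r_transformed)
```

The two printed values agree up to floating-point rounding. A negative scale factor on one variable would flip the sign of r without changing its magnitude.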

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
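
A minimal sketch of the pair-counting definition above. It implements the plain form in which tied pairs count as neither concordant nor discordant (often called tau-a). Libraries such as SciPy default to a tie-adjusted variant (tau-b), so results can differ when ties are present. The function name is hypothetical.

```python
from itertools import combinations


def kendall_tau(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # Positive product: both variables move in the same direction.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # direction == 0 is a tie and contributes to neither count.
    return (concordant - discordant) / len(pairs)


# Three observations: two concordant pairs, one discordant pair.
print(kendall_tau([1, 2, 3], [1, 3, 2]))
```

With two concordant pairs and one discordant pair out of three, the example evaluates to (2 - 1) / 3.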

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
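
A minimal sketch of a bias-corrected Cramér's V computed from a contingency table, following the commonly cited form of Bergsma's correction. It is an illustration under stated assumptions, not the report generator's own code: the function name is hypothetical, the chi-square statistic is computed without continuity correction, and the table is a plain list of rows of counts.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table (rows of counts)."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-square statistic, no continuity correction.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias corrections for phi^2 and for the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return math.sqrt(phi2_corr / denom) if denom > 0 else 0.0


independent = [[10, 10], [10, 10]]  # rows and columns unrelated
perfect = [[50, 0], [0, 50]]        # each row determines the column
print(cramers_v_corrected(independent), cramers_v_corrected(perfect))
```

The two toy tables mark the ends of the scale: the independent table gives 0 and the perfectly associated table gives 1 (up to floating-point rounding).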

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for Phik.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

     datum       M01AB  M01AE  N02BA  N02BE  N05B  N05C  R03   R06  Year  Month  Hour  Weekday Name
0    1/2/2014    0.00   3.67   3.4    32.40  7.0   0.0   0.0   2.0  2014  1      248   Thursday
1    1/3/2014    8.00   4.00   4.4    50.60  16.0  0.0   20.0  4.0  2014  1      276   Friday
2    1/4/2014    2.00   1.00   6.5    61.85  10.0  0.0   9.0   1.0  2014  1      276   Saturday
3    1/5/2014    4.00   3.00   7.0    41.10  8.0   0.0   3.0   0.0  2014  1      276   Sunday
4    1/6/2014    5.00   1.00   4.5    21.70  16.0  2.0   6.0   2.0  2014  1      276   Monday
5    1/7/2014    0.00   0.00   0.0    0.00   0.0   0.0   0.0   0.0  2014  1      276   Tuesday
6    1/8/2014    5.33   3.00   10.5   26.40  19.0  1.0   10.0  0.0  2014  1      276   Wednesday
7    1/9/2014    7.00   1.68   8.0    25.00  16.0  0.0   3.0   2.0  2014  1      276   Thursday
8    1/10/2014   5.00   2.00   2.0    53.30  15.0  2.0   0.0   2.0  2014  1      276   Friday
9    1/11/2014   5.00   4.34   10.4   52.30  14.0  0.0   1.0   0.2  2014  1      276   Saturday
91/11/20145.004.3410.452.3014.00.01.00.220141276Saturday

Last rows

      datum       M01AB  M01AE   N02BA  N02BE  N05B  N05C  R03   R06   Year  Month  Hour  Weekday Name
2096  9/29/2019   3.51   3.867   3.00   67.80  6.0   0.0   3.0   2.10  2019  9      276   Sunday
2097  9/30/2019   2.00   1.439   2.10   49.40  9.0   0.0   5.0   2.00  2019  9      276   Monday
2098  10/1/2019   11.34  2.406   0.10   47.00  15.0  4.0   17.0  1.50  2019  10     276   Tuesday
2099  10/2/2019   5.18   3.274   2.80   30.20  9.0   1.0   0.0   1.10  2019  10     276   Wednesday
2100  10/3/2019   5.00   3.000   4.00   40.40  10.0  0.0   2.0   2.00  2019  10     276   Thursday
2101  10/4/2019   7.34   5.683   2.25   22.45  13.0  0.0   1.0   1.00  2019  10     276   Friday
2102  10/5/2019   3.84   5.010   6.00   25.40  7.0   0.0   0.0   0.33  2019  10     276   Saturday
2103  10/6/2019   4.00   11.690  2.00   34.60  6.0   0.0   5.0   4.20  2019  10     276   Sunday
2104  10/7/2019   7.34   4.507   3.00   50.80  6.0   0.0   10.0  1.00  2019  10     276   Monday
2105  10/8/2019   0.33   1.730   0.50   44.30  20.0  2.0   2.0   0.00  2019  10     190   Tuesday